62 research outputs found

    HSRA: Hadoop-based spliced read aligner for RNA sequencing data

    The analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying gene expression levels. In RNA-seq experiments, mapping short reads to a reference genome or transcriptome is a crucial step and remains one of the most time-consuming. With the steady development of Next Generation Sequencing (NGS) technologies, unprecedented amounts of genomic data pose significant challenges for storage, processing and downstream analysis. As cost and throughput continue to improve, there is a growing need for new software solutions that minimize the impact of increasing data volume on RNA read alignment. In this work we introduce HSRA, a Big Data tool that uses the MapReduce programming model to extend the multithreading capabilities of a state-of-the-art spliced read aligner for RNA-seq data (HISAT2) to distributed memory systems such as multi-core clusters or cloud platforms. HSRA is built upon the Hadoop MapReduce framework and supports both single- and paired-end reads from FASTQ/FASTA datasets, providing output alignments in SAM format. The design of HSRA has been carefully optimized to avoid the main limitations and major causes of inefficiency found in previous Big Data mapping tools, which cannot fully exploit the raw performance of the underlying aligner. On a 16-node multi-core cluster, HSRA is on average 2.3 times faster than previous Hadoop-based tools. Source code in Java as well as a user’s guide are publicly available for download at http://hsra.dec.udc.es.
    Funding: Ministerio de Economía, Industria y Competitividad, TIN2016-75845-P; Xunta de Galicia, ED431G/0

    Hydrophobic CDR3 residues promote the development of self-reactive T cells

    Studies of individual T cell antigen receptors (TCRs) have shed some light on structural features that underlie self-reactivity. However, the general rules that can be used to predict whether TCRs are self-reactive have not been fully elucidated. Here we found that the interfacial hydrophobicity of amino acids at positions 6 and 7 of the complementarity-determining region CDR3β robustly promoted the development of self-reactive TCRs. This property was found irrespective of the member of the β-chain variable region (Vβ) family present in the TCR or the length of the CDR3β. An index based on these findings distinguished Vβ2+, Vβ6+ and Vβ8.2+ regulatory T cells from conventional T cells and also distinguished CD4+ T cells selected by the major histocompatibility complex (MHC) class II molecule I-Ag7 (associated with the development of type 1 diabetes in NOD mice) from those selected by a non–autoimmunity-promoting MHC class II molecule I-Ab. Our results provide a means for distinguishing normal T cell repertoires from autoimmunity-prone T cell repertoires.